A Data Augmentation Prior in Fractional Polynomial Generalized Linear Models

Authors

  • Hojin Moon
  • Steven B. Kim
Abstract

In microbial and chemical risk assessments, careful dose-response modeling is emphasized because a target risk level (probability of infection or illness) is often in the low benchmark response range of 1% to 10%. To address model uncertainty at low doses and to enhance diversity and flexibility for model-averaging, a set of fractional polynomial dose-response models can be considered. However, eliciting an informative prior in a Bayesian approach is difficult in those models because their parameters are not interpretable. This paper illustrates a method of eliciting an informative prior using a data augmentation prior.

INTRODUCTION

In microbial risk assessment (MRA), the infectious dose for a given risk level $p$, denoted $ID_{100p}$, is the dose that corresponds to $100p\%$ of infection (or illness). Statistically, $ID_{100p}$ is the dose that satisfies $P(Y = 1 \mid ID_{100p}) = p$, where $Y$ is a binary random variable with $Y = 1$ indicating infection. Under a monotonic dose-response relationship, which is assumed throughout our discussion, $ID_{100p}$ is unique. In chemical risk assessment, the benchmark dose for a given risk level $p$, denoted $BMD_{100p}$, is defined similarly.

Often, the allowed risk level for a population is extremely small, and accurate estimation of such a small risk of an environmental hazard requires sufficiently large samples. In MRA, a target $p$ often ranges from 0.01 to 0.10, but the experimental range is mostly well above $p$. Downward extrapolation is therefore commonly practiced based on inference about $ID_{100p}$. Although multiple dose-response models may fit the data in the experimental range, the estimate of $ID_{100p}$ can be very sensitive to the chosen model because each dose-response model behaves differently at low doses. For this reason, model-selection criteria and model-averaging methods have been considered by several authors [1-4].
In this manuscript we focus on prior elicitation in fractional polynomial models for Bayesian inference of $ID_{100p}$ or $BMD_{100p}$. For the estimation of a BMD, Namata et al. (2008) used second-degree fractional polynomial (FP) dose-response models to enhance diversity and flexibility in a model space for model-averaging. They showed that careful modifications of generalized linear models (GLMs) could effectively extend a model space for model-averaging. In this manuscript an FP model modified from a GLM is abbreviated as FP-GLM. In a Bayesian framework, [5] and [6] applied Bayesian model averaging (BMA), proposed by [7], using diffuse priors in the estimation of a BMD. Although experienced investigators may have more prior knowledge than a flat prior conveys, it is difficult to elicit that knowledge properly through the FP-GLMs because the model parameters are not easily interpretable. Furthermore, it is difficult to guess a plausible range of the parameter space a priori, because that range changes as the powers of the fractional polynomial vary. To address this difficulty, we apply the method of a data augmentation prior (DAP) as discussed by several authors [8-10].

Tractable prior elicitation has long been practiced in dose-response studies. [11] re-parameterized the two parameters in a logistic regression model. In the context of MRA, the first parameter corresponds to $ID_{100p}$, and the second parameter corresponds to $\theta_{d_1} = P(Y = 1 \mid d_1)$, where $d_1$ is the minimum dose level in an experiment. This is a useful technique in a GLM, but the re-parameterization is not as simple in an FP-GLM as in a GLM. [12] used a DAP in a logistic regression for a Bayesian adaptive design for Phase I clinical trials. In this manuscript we describe the procedure for using a DAP in FP-GLMs. In addition, we provide some examples to explain how a DAP controls the prior precision of $ID_{10}$ in any FP-GLM, which is not easily achieved by a direct prior specification on the model parameters.
A roadmap of our discussion follows. Section 2 includes a brief description of FP-GLMs and introduces the method of a DAP in FP-GLMs. Section 3 provides examples of prior elicitation in dose-response models to highlight the benefits of using a DAP. In Section 4 we apply a DAP to a Campylobacter jejuni dataset, and in Section 5 we conclude with a brief discussion.

Moon et al. (2014), JSM Math Stat 1(1): 1005 (2014)

STATISTICAL METHOD

Fractional polynomial generalized linear models

We let $d > 0$ denote a dose, and let $\theta_d$ be the probability of infection (or illness) at dose $d$. Further, we let $F(\cdot)$ and $F^{-1}(\cdot)$ denote a cumulative distribution function (CDF) and its inverse function, respectively. Then, a GLM is given by

$$F^{-1}(\theta_d) = \beta_1 + \beta_2 x_d, \tag{1}$$

where $-\infty < \beta_1 < \infty$, $\beta_2 > 0$, and $x_d = \log(d) \in (-\infty, \infty)$. For an FP-GLM, we let $x_d = \log(d + 1) > 0$ so that $x_d \to 0$ as $d \to 0$ and $x_d \to \infty$ as $d \to \infty$. Then, a second-degree FP-GLM is given by

$$F^{-1}(\theta_d) = \beta_1 (x_d)^{P_1} + \beta_2 (x_d)^{P_2}, \tag{2}$$

where $\beta_1 < 0$, $\beta_2 > 0$, $P_1 < 0$, and $P_2 > 0$. Under these four inequalities, the linear predictor satisfies

$$\lim_{d \to 0} \left[ \beta_1 (x_d)^{P_1} + \beta_2 (x_d)^{P_2} \right] = -\infty \quad \text{and} \quad \lim_{d \to \infty} \left[ \beta_1 (x_d)^{P_1} + \beta_2 (x_d)^{P_2} \right] = \infty.$$

The general form of second-degree FPs in Equation 2 is modified from the original definition of $m$th-degree FPs. The modification satisfies the essential properties of a monotonic dose-response model. That is, by the definition of a CDF, $\theta_d \to 0$ as $d \to 0$ and $\theta_d \to 1$ as $d \to \infty$. [13] suggested that the set $\{(P_1, P_2) : P_1 = -2, -1, -0.5;\ P_2 = 0.5, 1, 2, 3\}$ is sufficient for practical purposes, so we can create twelve FP-GLMs from an original GLM. In this manner, a model space easily gains diversity and flexibility for model-averaging. Several link functions may be considered, including the logistic, probit, complementary log-log, and Gumbel links.
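As a minimal sketch of Equation 2, the following Python code evaluates $\theta_d$ for each of the twelve power pairs under a logistic link and inverts the curve numerically to obtain $ID_{10}$. The coefficient values `BETA1` and `BETA2`, the function names, and the bisection bounds are hypothetical choices made for illustration only; they are not taken from the paper.

```python
import math
from itertools import product

# Power sets quoted in the text: three negative P1 values, four positive P2 values.
P1_VALUES = (-2.0, -1.0, -0.5)
P2_VALUES = (0.5, 1.0, 2.0, 3.0)

# Hypothetical coefficients (assumption); they only need beta1 < 0 and beta2 > 0.
BETA1, BETA2 = -1.0, 1.0


def logistic_cdf(eta):
    """Logistic CDF, written to avoid overflow for large |eta|."""
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    z = math.exp(eta)
    return z / (1.0 + z)


def fp_glm_prob(d, beta1, beta2, p1, p2):
    """theta_d for the second-degree FP-GLM of Equation 2 with a logistic link."""
    if not (beta1 < 0 and beta2 > 0 and p1 < 0 and p2 > 0):
        raise ValueError("need beta1 < 0, beta2 > 0, P1 < 0, P2 > 0")
    x = math.log(d + 1.0)  # x_d = log(d + 1) > 0 for d > 0
    eta = beta1 * x ** p1 + beta2 * x ** p2
    return logistic_cdf(eta)


def infectious_dose(p, beta1, beta2, p1, p2, lo=1e-6, hi=1e6, iters=200):
    """Dose d with theta_d = p, by bisection on log-dose (theta_d increases in d)."""
    if not fp_glm_prob(lo, beta1, beta2, p1, p2) < p < fp_glm_prob(hi, beta1, beta2, p1, p2):
        raise ValueError("target risk is not bracketed by [lo, hi]")
    a, b = math.log(lo), math.log(hi)
    for _ in range(iters):
        mid = 0.5 * (a + b)
        if fp_glm_prob(math.exp(mid), beta1, beta2, p1, p2) < p:
            a = mid  # risk too low at mid, so the solution is at a higher dose
        else:
            b = mid
    return math.exp(0.5 * (a + b))


# ID10 under each of the twelve FP-GLMs sharing the same (beta1, beta2).
id10_by_model = {
    (p1, p2): infectious_dose(0.10, BETA1, BETA2, p1, p2)
    for p1, p2 in product(P1_VALUES, P2_VALUES)
}

for (p1, p2), dose in id10_by_model.items():
    print(f"P1={p1:>4}, P2={p2:>3}: ID10 = {dose:.4g}")
```

Holding the coefficients fixed while the powers change gives a different low-dose curve for each model, which is one way to see why a plausible range for $(\beta_1, \beta_2)$ is hard to guess without reference to the powers.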
For the logistic, probit, complementary log-log, and Gumbel links, the respective CDFs are $F(\eta) = (1 + e^{-\eta})^{-1}$, $F(\eta) = \Phi(\eta)$, $F(\eta) = 1 - \exp(-e^{\eta})$, and $F(\eta) = \exp(-e^{-\eta})$, where $\eta$ stands for the linear predictor $\eta_d \equiv \beta_1 (x_d)^{P_1} + \beta_2 (x_d)^{P_2}$ and $\Phi(\cdot)$ denotes the CDF of $N(0, 1)$. A more detailed discussion of the FP-GLMs is given by Namata et al. (2008).

Data augmentation prior

Instead of a direct prior elicitation on the model parameters $\beta_1$ and $\beta_2$, a DAP induces the prior distribution of $(\beta_1, \beta_2)$ through the prior distribution of the probabilities of infection at two arbitrary doses, say $d_{-1} < d_0$. We denote the probability of infection at $d_j$ by $\theta_{d_j} = P(Y = 1 \mid d_j)$ for $j = -1, 0$, and write $\theta_j$ for $\theta_{d_j}$. We independently model $\theta_j \sim \text{Beta}(a_j, b_j)$. Then, we transform $(\theta_{-1}, \theta_0)$ to $(\beta_1, \beta_2)$ using the Jacobian transformation. The independent Beta priors yield the joint density function

$$f(\theta_{-1}, \theta_0) \propto \prod_{j=-1}^{0} \theta_j^{a_j - 1} (1 - \theta_j)^{b_j - 1}.$$

Since the linear predictor in a GLM is $\eta_j \equiv \beta_1 + \beta_2 x_{d_j}$, the chain rule gives

$$\frac{\partial \theta_j}{\partial \beta_k} = \frac{\partial F(\eta_j)}{\partial \beta_k} = \frac{\partial F(\eta_j)}{\partial \eta_j} \frac{\partial \eta_j}{\partial \beta_k} = \frac{\partial F(\eta_j)}{\partial \eta_j} \left( x_{d_j} \right)^{k-1}$$

for $k = 1, 2$, and the Jacobian determinant is
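The prior on $(\beta_1, \beta_2)$ induced by the Beta priors on $\theta_{-1}$ and $\theta_0$ can also be explored by simulation rather than through the analytic change of variables. The sketch below does this for the GLM of Equation 1 with a logistic link: it draws the two probabilities, solves the two linear equations $F^{-1}(\theta_j) = \beta_1 + \beta_2 x_{d_j}$ for $(\beta_1, \beta_2)$, and records $\log ID_{10}$. The doses, the Beta parameters, the function names, and the rule of discarding draws with $\beta_2 \le 0$ are all assumptions of this illustration, not steps stated in the paper.

```python
import math
import random
import statistics


def logit(theta):
    """Inverse of the logistic CDF, F^{-1}(theta) = log(theta / (1 - theta))."""
    return math.log(theta / (1.0 - theta))


def draw_dap_glm(d_minus1, d_0, a, b, rng):
    """One draw of (beta1, beta2) induced by independent Beta priors.

    a = (a_{-1}, a_0) and b = (b_{-1}, b_0). Returns None when a sampled
    probability is exactly 0 or 1, or when the draw implies beta2 <= 0
    (discarding such draws is an assumption of this sketch).
    """
    x_lo, x_hi = math.log(d_minus1), math.log(d_0)  # GLM uses x_d = log(d)
    theta_lo = rng.betavariate(a[0], b[0])
    theta_hi = rng.betavariate(a[1], b[1])
    if not (0.0 < theta_lo < 1.0 and 0.0 < theta_hi < 1.0):
        return None
    eta_lo, eta_hi = logit(theta_lo), logit(theta_hi)
    beta2 = (eta_hi - eta_lo) / (x_hi - x_lo)
    if beta2 <= 0:
        return None
    beta1 = eta_lo - beta2 * x_lo
    return beta1, beta2


def log_id_draws(p, d_minus1, d_0, a, b, n_draws, seed):
    """Simulated prior draws of log ID_{100p}, i.e. (logit(p) - beta1) / beta2."""
    rng = random.Random(seed)
    target = logit(p)
    draws = []
    while len(draws) < n_draws:
        pair = draw_dap_glm(d_minus1, d_0, a, b, rng)
        if pair is None:
            continue
        beta1, beta2 = pair
        draws.append((target - beta1) / beta2)
    return draws


def iqr(values):
    """Interquartile range, used here as a robust measure of prior spread."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return q3 - q1


# Hypothetical elicitation: doses 10 and 1000 with prior means 0.10 and 0.75.
base = log_id_draws(0.10, 10.0, 1000.0, a=(2, 15), b=(18, 5), n_draws=5000, seed=1)
# Same prior means, Beta parameters multiplied by ten (more concentrated priors).
sharp = log_id_draws(0.10, 10.0, 1000.0, a=(20, 150), b=(180, 50), n_draws=5000, seed=1)

print(f"IQR of log ID10, base prior:  {iqr(base):.3f}")
print(f"IQR of log ID10, sharp prior: {iqr(sharp):.3f}")
```

Working on the log-dose scale avoids overflow when a draw of $\beta_2$ is close to zero. Scaling $a_j$ and $b_j$ up while keeping their ratio fixed concentrates each Beta prior around the same mean, so comparing the two spreads gives a rough picture of how the Beta parameters govern the prior precision of $ID_{10}$.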


Similar resources

Using the fractional polynomial model to determine factors associated with the survival of patients with gastric cancer

Background & Objectives: The Cox regression model is one of the statistical methods used in survival analysis. The use of smoothing techniques in the Cox model yields more accurate parameter estimates. The fractional polynomial is one of these techniques in the Cox model. The aim of this study was to assess the effects of prognostic factors on survival of patients with gastric cancer using the fractiona...


Solving large systems arising from fractional models by preconditioned methods

This study develops and analyzes preconditioned Krylov subspace methods to solve linear systems arising from discretization of time-independent space-fractional models. First, we apply shifted Grunwald formulas to obtain a stable finite difference approximation to fractional advection-diffusion equations. Then, we employ two preconditioned iterative methods, namely, the preconditioned gen...


THE COMPARISON OF TWO METHOD NONPARAMETRIC APPROACH ON SMALL AREA ESTIMATION (CASE: APPROACH WITH KERNEL METHODS AND LOCAL POLYNOMIAL REGRESSION)

Small area estimation is a technique used to estimate parameters of subpopulations with small sample sizes. Small area estimation is needed in obtaining information on a small area, such as a sub-district or village. Generally, in some cases, small area estimation uses parametric modeling. But in fact, a lot of models have no linear relationship between the small area average and the covariat...


Presentation of two models for the numerical analysis of fractional integro-differential equations and their comparison

In this paper, we exhibit two methods to numerically solve the fractional integro differential equations and then proceed to compare the results of their applications on different problems. For this purpose, at first shifted Jacobi polynomials are introduced and then operational matrices of the shifted Jacobi polynomials are stated. Then these equations are solved by two methods: Caputo fractio...


Bayesian Generalized Kernel Models

We propose a fully Bayesian approach for generalized kernel models (GKMs), which are extensions of generalized linear models in the feature space induced by a reproducing kernel. We place a mixture of a point-mass distribution and Silverman’s g-prior on the regression vector of GKMs. This mixture prior allows a fraction of the regression vector to be zero. Thus, it serves for sparse modeling an...




Publication date: 2014